A Fuzzy R Code Similarity Detection Algorithm
Abstract
R is a programming language and software environment for statistical computing and data analysis that is steadily gaining popularity among practitioners and scientists. In this paper we present a preliminary version of a system that detects pairs of similar R code blocks within a given set of routines. The system is based on a suitable aggregation of the outputs of three different [0, 1]-valued (fuzzy) proximity degree estimation algorithms. An analysis on empirical data indicates that the system may in the future be applied successfully in practice, e.g., to detect plagiarism among students’ homework submissions or to analyze code recycling and code cloning in R’s open source package repositories.
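The aggregation idea in the abstract can be illustrated with a minimal sketch. The abstract does not name the three proximity algorithms or the aggregation function, so everything below is an assumption made for illustration: the three stand-in measures (token sequence ratio, token-set Jaccard index, token-count ratio), the arithmetic mean as the aggregation, the 0.7 threshold, and all function names are hypothetical and are not the authors' method.

```python
import re
from difflib import SequenceMatcher
from itertools import combinations


def tokens(code):
    """Crude tokenizer (assumption): identifiers, integers, single symbols."""
    return re.findall(r"[A-Za-z_.][A-Za-z0-9_.]*|\d+|\S", code)


def sequence_similarity(a, b):
    """Stand-in measure 1: ratio of matching token subsequences, in [0, 1]."""
    return SequenceMatcher(None, tokens(a), tokens(b), autojunk=False).ratio()


def jaccard_similarity(a, b):
    """Stand-in measure 2: Jaccard index of the two token sets, in [0, 1]."""
    sa, sb = set(tokens(a)), set(tokens(b))
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


def length_similarity(a, b):
    """Stand-in measure 3: ratio of shorter to longer token count, in [0, 1]."""
    la, lb = len(tokens(a)), len(tokens(b))
    if max(la, lb) == 0:
        return 1.0
    return min(la, lb) / max(la, lb)


def aggregate(scores):
    """Hypothetical aggregation: arithmetic mean, which stays within [0, 1]."""
    return sum(scores) / len(scores)


def similarity(a, b):
    """Aggregate the three stand-in proximity degrees into one score."""
    return aggregate([
        sequence_similarity(a, b),
        jaccard_similarity(a, b),
        length_similarity(a, b),
    ])


def similar_pairs(routines, threshold=0.7):
    """Return (name_a, name_b, score) for pairs scoring at least threshold.

    The threshold of 0.7 is an arbitrary illustrative value.
    """
    result = []
    for (name_a, code_a), (name_b, code_b) in combinations(routines.items(), 2):
        score = similarity(code_a, code_b)
        if score >= threshold:
            result.append((name_a, name_b, score))
    return result


if __name__ == "__main__":
    routines = {
        "f": "f <- function(x) mean(x) + 1",
        "g": "g <- function(y) mean(y) + 1",
        "h": "plot(rnorm(100))",
    }
    for name_a, name_b, score in similar_pairs(routines):
        print(name_a, name_b, round(score, 3))
```

Each measure is kept in [0, 1] so that any averaging-type aggregation also yields a value in [0, 1]. Other aggregation functions (for example a minimum or a weighted mean) could be swapped into `aggregate` without changing the rest of the sketch.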
Similar resources
Fraud Detection of Credit Cards Using Neuro-fuzzy Approach Based on TLBO and PSO Algorithms
The aim of this paper is to detect fraud related to bank credit cards. The large amount of data and the similarity among samples make traditional classification methods slow and inaccurate at separating healthy from unhealthy sample behavior. Therefore, in this study, the Adaptive Neuro-Fuzzy Inference System (ANFIS) is used to obtain a more efficient and accurate algorithm. By com...
A Fuzzy Logic Approach to Computer Software Source Code Authorship Analysis
Software source code authorship analysis has become an important area in recent years with promising applications in both the legal sector (such as proof of ownership and software forensics) and the education sector (such as plagiarism detection and assessing style). Authorship analysis encompasses the sub-areas of author discrimination, author characterization, and similarity detection (also r...
Fuzzy Automatic Detection of Landmines from Sensors Data
In this paper, two fuzzy algorithms for automatic decision making in antipersonnel landmine detection from sensor data are presented. The first is a “feature in-decision out” fuzzy fusion algorithm for the measurements of two sensors, namely a ground penetrating radar (GPR) and a metal detector (MD). The inputs to the fuzzy fusion algorithm are features extracted from both GPR and MD measurements while the output...
A Fuzzy Approach to Performance Evaluation of Edge Detectors
The system PICASSO 2 is the latest version of a software package designed for the comparative evaluation of image processing algorithms. In this paper we discuss the part of the system that evaluates edge detectors and consider its fuzzy extension. Namely, we introduce ground truth edge maps defined in a fuzzy way and complete the system with several fuzzy similarity measures. The p...
Automatic thematic categorization of documents using a fuzzy taxonomy and fuzzy hierarchical clustering
In this paper we formally define the problem of automatic detection of thematic categories in a semantically indexed document, and identify the main obstacles to overcome in this process. Furthermore, we explain how detection of thematic categories can be achieved, with the use of a fuzzy quasi-taxonomic relation. Our approach relies on a fuzzy hierarchical clustering algorithm; this algorithm ...
Publication date: 2014